Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add director notfiy method and use it for admin health changes #4186

Closed
wants to merge 3 commits into from

Conversation

nigoroll
Copy link
Member

@nigoroll nigoroll commented Sep 17, 2024

Pondering solutions for #4183 it became apparent that we currently have no way for core code to notify directors of changes, in particular changes of the admin_health. I looked into re-using the healthy callback (see #4184 for my failed attempt), but it is unsuitable because VRT_Healthy() rightly does not even call it for an "auto" admin_health.

So the cleanest option is another director callback, which is cheap and also helps us maintain clean layering.

This fixes the layer violation introduced with e46f972, where a director private pointer was used with a VBE specific function.

Fixes #4183

@dridi
Copy link
Member

dridi commented Sep 17, 2024

Your branch didn't start on the tip of master and conflicts with the change I both suggested and pushed after verifying my suspicion in a test case.

The "problem" with this change is the ABI breakage it introduces (the real problem being of course the bug you are trying to fix).

edit: it's probably fine to include as part of the 7.6 series with VRT 20.0, despite the 7.6.0 release

@nigoroll
Copy link
Member Author

Your branch didn't start on the tip of master and conflicts with the change I both suggested and pushed after verifying my suspicion in a test case.

Yes, I noticed, will adjust.

The "problem" with this change is the ABI breakage it introduces (the real problem being of course the bug you are trying to fix).

edit: it's probably fine to include as part of the 7.6 series with VRT 20.0, despite the 7.6.0 release

I think this change would call for a VRT_MINOR_VERSION bump only.

@nigoroll nigoroll force-pushed the vdi_notify branch 2 times, most recently from fe350b8 to 9a4e6ea Compare September 17, 2024 08:07
@nigoroll nigoroll marked this pull request as ready for review September 17, 2024 08:08
@dridi
Copy link
Member

dridi commented Sep 17, 2024

I think this change would call for a VRT_MINOR_VERSION bump only.

Absolutely not, this is changing the layout of struct vdi_methods and requires a rebuild!

@nigoroll
Copy link
Member Author

nigoroll commented Sep 17, 2024

Absolutely not, this is changing the layout of struct vdi_methods and requires a rebuild!

You are right. More specifically, a member is added, so sizeof() changes and without a rebuild, we would cause an out-of-bounds access. Thank you! (force-pushed VRT bump)

@delthas
Copy link
Contributor

delthas commented Sep 19, 2024

This causes a panic on a simple VTC, see #4189

@nigoroll
Copy link
Member Author

This causes a panic on a simple VTC, see #4189

fixed with 57e2a64

@nigoroll
Copy link
Member Author

OKed during bugwash

Comment on lines +142 to +143
static void
vbe_connwait_signal_all(const struct backend *bp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: vbe_connwait_broadcast()

Comment on lines 697 to +687
.destroy = vbe_destroy,
.panic = vbe_panic,
.list = vbe_list,
.healthy = vbe_healthy
.healthy = vbe_healthy,
.notify = vbe_notify
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was bothered by the notify method because this is a breaking change as we established earlier and I wasn't satisfied with such a resolution on the 7.6 branch where we'd rather not break VRT if we can help it.

It then occurred to me that in spirit the new notify method is very close to the event method and I was wondering whether we'd agree to the creation of a new going-sick event even though that was not the original scope of the event enum.

I don't personally see a strong reason against widening the scope beyond the existing set of events.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that yes, no or something else, @dridi ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, phk also brought up using the .event method, is this what you mean? I can look at it, but it did not seem natural to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was suggesting adding a new enum entry for the event callback, one that would be dedicated to backend/directors, instead of adding a new notify callback.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR: I did consider using the .event() method, but it seemed confusing to me because it is tied to the VCL temperature / lifetime. But as both you and phk do not share this concern, I will implement this to have something to look at.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Pondering solutions for varnishcache#4183 it became apparent that we currently have no way
for core code to notify directors of changes, in particular changes of the
admin_health. I looked into re-using the healthy callback, but it is unsuitable
because VRT_Healthy() does not even call it for an "auto" admin_health.

So the cleanest option is another director callback, which is cheap and also
helps us maintain clean layering.
Using the new facility, we can now selectively notify interested backend types
(so far: VBE only) when the admin health changes.

This fixes the layer violation introduced with
e46f972, where a director private pointer was
used with a VBE specific function.

Fixes varnishcache#4183
@nigoroll
Copy link
Member Author

The alternative implementation with a VDI_EVENT is in #4196

@nigoroll
Copy link
Member Author

obsoleted by #4196

@nigoroll nigoroll closed this Sep 30, 2024
@nigoroll nigoroll deleted the vdi_notify branch September 30, 2024 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic on varnishadm backend.set_health of dynamic backend
3 participants